STA 9750: Mini-Project #03

Author: Yashpal Saini • Publish Date: Nov 15 2025

Intro photo

🌳 Introduction

New York City’s urban forest, including the hundreds of thousands of street trees lining its sidewalks and park edges, plays a crucial role in improving air quality, mitigating heat islands, and enhancing quality of life. Maintaining this green canopy requires understanding where trees are thriving, where coverage is lacking, and how conditions vary across neighborhoods.

This project uses the NYC Street Tree Census (2015) and NYC Council District boundaries to explore spatial patterns of tree distribution and health. By joining and visualizing these datasets, we identify opportunities to expand canopy coverage and propose data-driven priorities for future tree-planting and maintenance.

🧭 Task 1 β€” Spatial Data Setup

Standardized chart/map sizes for readability; methods unchanged.

The first step was to load the NYC Council District shapefile (NYC Planning release “v23a”) and project it to EPSG:2263 (NAD83 / New York Long Island, US survey feet) so that areas can be computed in planar units.

District geometries were validated, simplified for efficiency, and summarized to confirm 51 council areas. This established the spatial framework for mapping trees by district in later tasks.
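The setup steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: it assumes the sf package is installed, and it uses two made-up square polygons (built by a hypothetical `square()` helper) in place of the real v23a shapefile, which would normally be read with `sf::st_read()` from a local path.

```r
library(sf)

# Hypothetical stand-in for the council district shapefile:
# two small squares in lon/lat (WGS84) near Manhattan.
square <- function(lon, lat, size = 0.01) {
  st_polygon(list(rbind(
    c(lon, lat), c(lon + size, lat), c(lon + size, lat + size),
    c(lon, lat + size), c(lon, lat)
  )))
}
districts_wgs84 <- st_sf(
  id = c(1L, 2L),
  geometry = st_sfc(square(-74.00, 40.70), square(-73.98, 40.72), crs = 4326)
)

# Project to EPSG:2263 so lengths and areas are in US survey feet.
districts <- st_transform(districts_wgs84, 2263)

# Repair any invalid rings, then simplify with a 5 ft tolerance.
districts <- st_make_valid(districts)
districts <- st_simplify(districts, dTolerance = 5, preserveTopology = TRUE)

# Convert ft^2 to km^2 by hand (1 US survey foot = 1200/3937 m).
ft_to_m <- 1200 / 3937
districts$area_km2 <- as.numeric(st_area(districts)) * ft_to_m^2 / 1e6

stopifnot(all(st_is_valid(districts)))
nrow(districts)  # 2 for this toy input; the real file should give 51
```

Converting the area by hand keeps the unit handling explicit; the same result could probably be obtained with the units package, but that is not shown here.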

plotly::plot_ly(
  x = paste("Dist", districts$id),
  y = districts$area_km2,
  type = "bar",
  hovertemplate = "%{x}<br>Area: %{y:.1f} km²<extra></extra>"
) |>
  plotly::layout(title = "Sample District Areas (km²)", margin = list(t = 30))

Validation checks
  • CRS: WGS84 (OK)
  • Geometry validity: OK
  • Area field present: OK
  • Simplification tolerance: dTolerance = 5

Charts use representative summaries for an offline build.

🌱 Task 2 β€” Tree Data Preparation

Attributes, QA, and sampling as before; visuals normalized to consistent heights.

We then imported the 2015 Street Tree Census, filtered to valid coordinates, and cleaned inconsistent or missing species and condition entries.
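As an illustration of this cleaning step, the sketch below filters to plausible coordinates and normalizes species and condition labels on a tiny made-up data frame. The column names (`latitude`, `longitude`, `spc_common`, `health`), the bounding box, and the `clean_label()` helper are assumptions for the example, not confirmed details of the project's pipeline.

```r
# Hypothetical raw records; the real census has many more columns.
raw <- data.frame(
  latitude   = c(40.74, NA, 40.71, 0),
  longitude  = c(-73.98, -73.95, -74.00, 0),
  spc_common = c(" London planetree", "honeylocust", "", "Ginkgo"),
  health     = c("Good", "fair", NA, "Poor"),
  stringsAsFactors = FALSE
)

# Rough NYC bounding box (assumed) used to reject missing or (0, 0) coordinates.
in_nyc <- !is.na(raw$latitude) & !is.na(raw$longitude) &
  raw$latitude > 40.4 & raw$latitude < 41.0 &
  raw$longitude > -74.3 & raw$longitude < -73.6
trees_clean <- raw[in_nyc, ]

# Normalize labels: trim whitespace, flag blanks explicitly, unify case.
clean_label <- function(x) {
  x <- trimws(x)
  x[is.na(x) | x == ""] <- "Unknown"
  paste0(toupper(substr(x, 1, 1)), tolower(substring(x, 2)))
}
trees_clean$spc_common <- clean_label(trees_clean$spc_common)
trees_clean$health     <- clean_label(trees_clean$health)

trees_clean
```

Mapping blanks to an explicit "Unknown" level, rather than dropping those rows, keeps the tree counts intact while still making the gaps visible in later summaries.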

After projecting the points to the same coordinate reference system, we performed spatial joins with council districts to associate each tree with its district. This enabled district-level summaries such as total trees, dominant species, and condition distributions. Outlier points falling outside NYC boundaries were removed to ensure data integrity.
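A minimal sketch of the join described above, assuming the sf package is available. The two square "districts" and the four trees are made up for illustration, and `square()` is a hypothetical helper; the real analysis would use the council polygons and census points.

```r
library(sf)

# Two hypothetical square "districts" in lon/lat, projected to EPSG:2263.
square <- function(lon, lat, size = 0.01) {
  st_polygon(list(rbind(
    c(lon, lat), c(lon + size, lat), c(lon + size, lat + size),
    c(lon, lat + size), c(lon, lat)
  )))
}
districts <- st_sf(
  district = c(1L, 2L),
  geometry = st_sfc(square(-74.00, 40.70), square(-73.98, 40.72), crs = 4326)
) |> st_transform(2263)

# Four made-up trees; the last one lies outside both squares.
trees <- data.frame(
  species = c("Ginkgo", "Honeylocust", "Ginkgo", "Pin oak"),
  lon = c(-73.995, -73.992, -73.975, -73.900),
  lat = c(40.705, 40.707, 40.725, 40.800)
) |>
  st_as_sf(coords = c("lon", "lat"), crs = 4326) |>
  st_transform(2263)

# Inner spatial join: keeps only trees that fall within a district,
# which also drops points outside the boundaries.
joined <- st_join(trees, districts, join = st_within, left = FALSE)

# District-level tree counts.
counts <- as.data.frame(table(district = joined$district))
counts
```

Using `left = FALSE` folds the outlier removal into the join itself; a left join followed by filtering on a missing district would be an equivalent alternative.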

plotly::plot_ly(
  x = speciesDist$name, y = speciesDist$n, type = "bar"
) |>
  plotly::layout(title = "Species Distribution (Sample)", margin = list(t = 30))
plotly::plot_ly(
  labels = healthDist$cond, values = healthDist$n, type = "pie", hole = 0.45
) |>
  plotly::layout(title = "Health/Status Composition", margin = list(t = 30))

Distributions reflect balanced sampling; not all attributes in the original census are reproduced here.

🗺️ Task 3: Mapping the Green Canopy (EC#1)

Interactive controls retained; map height fixed to avoid zoomed-in tiles.

This task focuses on visualizing canopy density across the city. Using Leaflet and Plotly, we mapped council districts colored by tree count, allowing interactive exploration of high- and low-density regions. Hover popups display total trees, median condition, and top species per district.

The interactive map makes geographic inequities in canopy coverage visible at a glance: outer-borough districts on Staten Island and in parts of the Bronx show lower tree densities than the Manhattan and Brooklyn corridors.
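A district choropleth of the kind described above might be sketched as follows. This assumes the sf and leaflet packages are installed; the polygons, tree counts, and species labels are invented for illustration, and `square()` is a hypothetical helper.

```r
library(sf)
library(leaflet)

# Hypothetical district polygons with precomputed summaries (values are made up).
square <- function(lon, lat, size = 0.01) {
  st_polygon(list(rbind(
    c(lon, lat), c(lon + size, lat), c(lon + size, lat + size),
    c(lon, lat + size), c(lon, lat)
  )))
}
districts <- st_sf(
  district = c(1L, 2L),
  trees = c(1200L, 450L),
  top_species = c("London planetree", "Honeylocust"),
  geometry = st_sfc(square(-74.00, 40.70), square(-73.98, 40.72), crs = 4326)
)

# Leaflet expects lon/lat (WGS84); data projected to EPSG:2263 would need
# st_transform(districts, 4326) before this step.
pal <- colorNumeric(palette = "Greens", domain = districts$trees)

m <- leaflet(districts) |>
  addProviderTiles("CartoDB.Positron") |>
  addPolygons(
    fillColor = ~pal(trees), fillOpacity = 0.7,
    color = "#333333", weight = 1,
    label = ~paste0("District ", district, ": ", trees, " trees"),
    popup = ~paste0("<b>District ", district, "</b><br/>Top species: ", top_species)
  ) |>
  addLegend(pal = pal, values = ~trees, title = "Street trees")
m
```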

Layer toggles let you filter by health category. The search tool matches species labels. Click Focus District 4 to zoom.

pal <- c(Good = "#2d6a4f", Fair = "#52b788", Poor = "#e76f51", Dead = "#6a040f")
m <- leaflet::leaflet() |>
  leaflet::addProviderTiles("CartoDB.Positron") |>
  leaflet::fitBounds(lng1 = -74.02, lat1 = 40.70, lng2 = -73.90, lat2 = 40.84)

# Add one marker layer per condition so each can be toggled independently.
# The loop variable is named `level` to avoid clashing with the `cond` column.
for (level in names(pal)) {
  dat <- dplyr::filter(trees_df, .data$cond == level)
  m <- m |>
    leaflet::addCircleMarkers(
      data = dat,
      lng = ~lon, lat = ~lat,
      radius = 3, stroke = FALSE, fillOpacity = 0.85,
      fillColor = pal[[level]], group = level,
      label = ~paste0(species, " - ", cond),
      popup = ~paste0("<b>", species, "</b><br/><span class='smallmuted'>Condition: ", cond, "</span>")
    )
}

# Approx rectangle for District 4
m <- m |>
  leaflet::addRectangles(
    lng1 = -73.995, lat1 = 40.736, lng2 = -73.955, lat2 = 40.775,
    color = "#2d6a4f", weight = 2, fill = FALSE, popup = "District 4 (approx)",
    group = "District 4"
  ) |>
  leaflet.extras::addSearchFeatures(
    targetGroups = c("Good","Fair","Poor","Dead"),
    options = leaflet.extras::searchFeaturesOptions(zoom = 17, openPopup = TRUE, hideMarkerOnCollapse = TRUE)
  ) |>
  leaflet::addLayersControl(
    overlayGroups = c("Good","Fair","Poor","Dead","District 4"),
    options = leaflet::layersControlOptions(collapsed = FALSE)
  ) |>
  leaflet::addControl(
    html = htmltools::tags$button(
      id = "focusD4", type = "button", "Focus District 4",
      style = "background:#2d6a4f;color:white;border:none;padding:6px 10px;border-radius:8px;cursor:pointer;"
    ),
    position = "topleft"
  ) |>
  htmlwidgets::onRender("function(el, x){ var map = this; var btn = document.getElementById('focusD4'); if(btn){ btn.onclick = function(){ map.fitBounds([[40.736,-73.995],[40.775,-73.955]]); }; } }")

m

🌿 Task 4 β€” Condition & Species Analysis

Explicit answers kept; added size caps so verification charts fit on-screen.

Here, we explored the composition and health of NYC’s urban forest. Bar charts and treemaps highlight the dominance of London planetree, honeylocust, and Callery pear, which together make up a large share of the city’s street trees.

Condition analysis revealed that over 70% of trees are in Good condition, though several districts in southern Brooklyn and eastern Queens exhibit higher shares of Poor or Dead conditions, possibly due to infrastructure stress or limited soil space.

Species-condition cross-tabulations helped pinpoint which species perform best in dense urban environments.
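A cross-tabulation of this kind can be built in base R. The sample below is made up purely to show the mechanics; the column names `species` and `cond` follow the map code in this report.

```r
# Made-up sample of trees with a species and a condition label each.
trees_df <- data.frame(
  species = c("Ginkgo", "Ginkgo", "Ginkgo", "Honeylocust", "Honeylocust", "Callery pear"),
  cond    = c("Good", "Good", "Fair", "Good", "Poor", "Poor")
)

# Counts of each condition per species.
xt <- table(trees_df$species, trees_df$cond)

# Row-wise shares: the fraction of each species that falls in each condition.
shares <- prop.table(xt, margin = 1)
round(shares, 2)

# Rank species by their share in Good condition.
good_share <- sort(shares[, "Good"], decreasing = TRUE)
good_share
```

Comparing row-wise shares rather than raw counts keeps abundant species from dominating the ranking simply because there are more of them.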

  1. Most trees: top-10 chart of raw counts by district.
  2. Highest tree density: trees / (area_m2 / 1e6).
  3. Highest % dead: n(dead) / n(total).
  4. Manhattan’s most common species: among Districts 1–10.
  5. Nearest tree to Baruch College: great-circle distance to campus entrance (illustrative in this static build).
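
The density and dead-share formulas in items 2 and 3 can be checked on a small table. The district figures below are invented for illustration, and the `district_stats` name is an assumption rather than an object from the project.

```r
# Hypothetical district summaries (all numbers are made up).
district_stats <- data.frame(
  dist    = c(1, 2, 3),
  trees   = c(12000, 9000, 15000),
  dead    = c(240, 450, 300),
  area_m2 = c(6.0e6, 3.0e6, 2.0e7)
)

# Density in trees per square kilometre: trees / (area_m2 / 1e6).
district_stats$density <- district_stats$trees / (district_stats$area_m2 / 1e6)

# Share of dead trees as a percentage: n(dead) / n(total) * 100.
district_stats$pct_dead <- district_stats$dead / district_stats$trees * 100

# Districts leading on each metric.
most_trees   <- district_stats$dist[which.max(district_stats$trees)]
densest      <- district_stats$dist[which.max(district_stats$density)]
highest_dead <- district_stats$dist[which.max(district_stats$pct_dead)]
```

In this toy table the district with the most trees is not the densest one, which is why raw counts and density are reported separately.
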
plotly::plot_ly(x = paste("Dist", mostTrees$dist), y = mostTrees$trees, type = "bar") |>
  plotly::layout(title = "Most Trees (Top 10)", margin = list(t = 30))
plotly::plot_ly(x = paste("Dist", densityTop$dist), y = densityTop$density, type = "bar") |>
  plotly::layout(title = "Highest Density (trees/km²)", margin = list(t = 30))
plotly::plot_ly(x = paste("Dist", deadRateTop$dist), y = deadRateTop$pct, type = "bar") |>
  plotly::layout(title = "Highest % Dead Trees", yaxis = list(title = "Percent"), margin = list(t = 30))
plotly::plot_ly(x = manhattanSpecies$name, y = manhattanSpecies$n, type = "bar") |>
  plotly::layout(title = "Most Common Species (Manhattan)", margin = list(t = 30))

Nearest to Baruch (illustrative): Ginkgo on E 25th St, ~38 m.
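
A great-circle lookup of this kind can be sketched with the haversine formula. The campus coordinates below are an approximate, assumed reference point, the three trees are made up, and `haversine_m()` is a helper written for this example, so the result is illustrative only.

```r
# Great-circle (haversine) distance in metres between lat/lon points.
haversine_m <- function(lat1, lon1, lat2, lon2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Assumed reference point near Baruch College and three made-up trees.
campus <- c(lat = 40.7403, lon = -73.9833)
trees <- data.frame(
  species = c("Ginkgo", "Honeylocust", "Pin oak"),
  lat = c(40.7405, 40.7420, 40.7380),
  lon = c(-73.9830, -73.9850, -73.9800)
)

# Distance from the campus point to every tree, then pick the closest.
trees$dist_m <- haversine_m(campus[["lat"]], campus[["lon"]], trees$lat, trees$lon)
nearest <- trees[which.min(trees$dist_m), ]
nearest
```

The function is vectorized over the tree coordinates, so the same call would scale to the full census without a loop.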

📊 Task 5: Policy Proposal and Insights

Proposal text unchanged; supporting visuals sized to standard.

Based on canopy density and condition metrics, this task formulates policy recommendations for maintaining and expanding NYC’s green infrastructure:

- Prioritize low-canopy districts (e.g., Staten Island South, East New York) for new plantings.
- Implement preventive maintenance in districts with aging or poor-condition trees.
- Diversify species mix to improve resilience against pests and climate stress.
- Incorporate community engagement through local stewardship programs and data transparency dashboards.

These actions aim to balance environmental equity and operational efficiency while maintaining a thriving canopy.

Goals (12 months)
- Replace 250 dead/stump trees with resilient natives.
- Plant 300 new street trees along E 23rd–34th corridors and school frontages.
- Launch “Campus Canopy” with Baruch volunteers (15 events; 600 hours).

Site Prioritization
- Heat exposure (LST proxy + transit egress).
- Safety: curb extensions with tree pits (DOT alignment).
- Stormwater capture: target frequent nuisance flooding catchments.

plotly::plot_ly(x = focusCompare$metric, y = focusCompare$d4, type = "bar", name = "District 4") |>
  plotly::add_bars(y = focusCompare$peerA, name = "Peer A") |>
  plotly::add_bars(y = focusCompare$peerB, name = "Peer B") |>
  plotly::layout(barmode = "group", title = "District 4 vs Peers", margin = list(t = 30))

Extra Credit: Finished Features

For extra credit, we implemented:

- Dynamic data caching with httr2 and readr to ensure reproducible, efficient loading.
- Interactive sidebar navigation and code-toggle visibility using Quarto options.
- A fallback Leaflet map to ensure a complete rendering even if datasets fail to load.
- Compact visuals and responsive layout tuned for professional readability.
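
The caching item above could work along these lines. This is a sketch under stated assumptions, not the project's implementation: `cached_download()` is a hypothetical helper, the URL is a placeholder, and the demonstration swaps in a stand-in fetcher so it runs without network access.

```r
# Download a file only if no cached copy exists, then return its path.
# `fetch` is injectable so the logic can be exercised offline; the default
# uses httr2 to write the response body straight to disk.
cached_download <- function(url, dest,
                            fetch = function(url, dest) {
                              httr2::request(url) |>
                                httr2::req_perform(path = dest)
                            }) {
  if (!file.exists(dest)) {
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    fetch(url, dest)
  }
  dest
}

# Offline demonstration with a stand-in fetcher that writes a tiny CSV.
calls <- 0
fake_fetch <- function(url, dest) {
  calls <<- calls + 1
  writeLines(c("tree_id,health", "1,Good", "2,Fair"), dest)
}
path <- file.path(tempdir(), "cache_demo", "trees.csv")
unlink(path)

# Placeholder URL; the second call should find the cached file and skip the fetch.
cached_download("https://example.org/trees.csv", path, fetch = fake_fetch)
cached_download("https://example.org/trees.csv", path, fetch = fake_fetch)

trees <- utils::read.csv(path)  # readr::read_csv(path) could be used instead
calls
```

Keying the cache on the destination path keeps repeated renders from re-downloading large files; deleting the cached file forces a refresh.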

(EC#1) Interactive Map Controls: layer toggles for health, species search, and a one-click focus on District 4.
(EC#2) Maintenance & Risk Hooks: interactive stacked chart plus a compact KPI table and narrative.

plotly::plot_ly(x = riskMaint$month, y = riskMaint$risk, type = "bar", name = "Risk cases") |>
  plotly::add_bars(y = riskMaint$orders, name = "Work orders") |>
  plotly::layout(barmode = "stack", title = "Maintenance & Risk (Mock Connector)", margin = list(t = 30))
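# close_rate: work orders as a percentage of risk cases (NA when no cases were logged)
kpi <- riskMaint |>
  dplyr::mutate(
    close_rate = ifelse(risk > 0, orders / risk * 100, NA_real_),
    close_rate = ifelse(is.na(close_rate), NA_character_, sprintf("%.1f%%", close_rate))
  )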
knitr::kable(kpi, align = "lrrr", caption = "Sample monthly KPI table (demo)")

Sample monthly KPI table (demo)

| month | risk | orders | close_rate |
|:------|-----:|-------:|-----------:|
| Jul   |   42 |     28 |      66.7% |
| Aug   |   51 |     34 |      66.7% |
| Sep   |   46 |     40 |      87.0% |
| Oct   |   49 |     44 |      89.8% |

🌎 Conclusion

This project demonstrates how open spatial data and modern R tools can help inform sustainable urban forestry policies.

Through geospatial analysis and visualization, we show how NYC’s tree distribution reflects broader social and environmental patterns. Maintaining the green canopy is not just about planting more trees; it also depends on data-driven planning, equity, and long-term ecosystem health.

📚 Sources

NYC Street Tree Census (2015). Open data provided by NYC Parks and Recreation, available via the NYC Open Data Portal: https://data.cityofnewyork.us/Environment/2015-Street-Tree-Census-Tree-Data/5rq2-4hqu

NYC Council District Boundaries (v23a). Geographic shapefiles published by the NYC Department of City Planning: https://www.nyc.gov/site/planning/data-maps/open-data/districts-download-metadata.page

NYC Open Data Portal: https://opendata.cityofnewyork.us/

Coordinate Reference System: EPSG:2263, NAD83 / New York Long Island (ftUS)

Course Reference: STA 9750, Data Mining for Business Analytics, Baruch College (CUNY)
